ABSTRACT

This paper considers filters (the Mexican hat wavelet, the matched filter and the scale-adaptive filter) that optimize the detection/separation of point sources on a background. We give a one-dimensional treatment and assume that the sources have a Gaussian profile, i.e. $\tau(x) = e^{-x^2/2R^2}$, and that the background is modelled by a homogeneous and isotropic Gaussian random field characterised by a power spectrum $P(q) \propto q^{-\gamma}$, $\gamma \geq 0$. Local peak detection is used after filtering. The Neyman-Pearson criterion is then used to define the confidence level for detections, and the filters are compared on the basis of the number of spurious and true detections. We have performed numerical simulations to test the theoretical ideas and conclude that the results of the simulations agree with the analytical results.

1 INTRODUCTION

The detection of localized signals or features in one-dimensional (1D) data or two-dimensional (2D) images is one of the most challenging aspects of image analysis.

We are interested in the detection of compact sources (signal) embedded in a background (1D case). We assume that the profile of the source and the statistical properties of the background are known. The primary goal is to filter the data linearly in order to remove part of the background. Several filters have been introduced in the literature to deal with the problem. Four examples are the continuous 'Mexican hat' wavelet, the difference of two Gaussians, matched filters and scale-adaptive filters. The first two are filters given 'a priori', adapted to the detection of point sources, whereas the matched filter is constructed taking into account the profile and the background in order to get the maximum SNR at the source position. The scale-adaptive filter is constructed taking into account the previous properties and also the constraint of having a maximum in filtered space at the scale and position of the source. Hereinafter, we identify the positions of the possible sources with the local maxima.

The 'Mexican hat' wavelet has been extensively applied as a filter in recent years to analyse optical, X-ray and microwave data. Optical images of galaxy fields have been analysed to detect voids and high-density structures in the first CfA redshift survey slice (Slezak et al. 1993). Microwave images have been analysed (Cayon et al. 2000; Vielva et al. 2001a) and combined with the maximum entropy method (Vielva et al. 2001b) to obtain catalogues of point sources from simulated maps at the different frequencies that will be observed by the future Planck mission. The Mexican hat has also been used to detect X-ray sources (Damiani et al. 1997) and is presently being used for the on-going XMM-Newton mission (Valtchanov et al. 2001).

Other useful filters include the so-called 'matched' filters. They optimize the signal-to-noise ratio and have been used mainly in signal processing. The generalization of matched filters to two dimensions has been used recently to detect clusters of galaxies from optical imaging data (Postman et al. 1996; Kawasaki et al. 1998). In this approach the method uses galaxy positions, magnitudes (and photometric/spectroscopic redshifts if available) to find clusters and determine their redshift. Matched filters have also been applied to microwave maps to detect clusters through the Sunyaev-Zeldovich effect, on either single or multifrequency maps (Herranz et al. 2002a,b).

Other remarkable filters that have recently been introduced in the astrophysics literature (Sanz et al. 2001) are the so-called scale-adaptive filters. They have been applied to the detection of point sources in time-ordered data from the future Planck mission (Herranz et al. 2002c), and to the detection of SZ emission in single microwave maps and of X-ray emission from clusters of galaxies in single X-ray maps (Herranz et al. 2002a). More recently, their application to the detection of SZ clusters in multifrequency maps representing the future observations by the Planck mission has been considered (Herranz et al. 2002b).

One interesting question regarding all of these filters is optimality, which we define in terms of the following properties: the confidence level of the detections, the number of spurious sources that emerge in the process and the number of real sources detected (detection limit and completeness magnitude). As we will see in this paper, these properties are not only related to the SNR gained in the filtering process but also depend on the moments of the filtered field up to fourth order, the curvature of the source and the amplification in the 1D case. These quantities combine in a complicated way, so that the choice of filter depends entirely on the source profile and the background. This shows that, in addition to the amplification, which was suggested by Vio et al. (2002) as the quantity to use when comparing filters, one must take into account other quantities that also play an important role. The final identification of the sources by simple thresholding is not the whole story regarding optimality. Moreover, if the scale of the source needs to be estimated from the data, it must be identified 'a posteriori' on the SNR map when using matched filters, which introduces noise in the process. By contrast, the adaptive filter allows one to get the sources directly from the filtered map.

In section 2, we introduce two useful quantities: num- 
ber of maxima in a Gaussian background and define the 
detection problem. In section 3, we remark on the optimal 
statistic that allows one to define the region of acceptance, 
i. e. the confidence level of the detections. In section 4, we 
comment on the different filters to be compared (Mexican 
hat, matched and scale-adaptive filters). In section 5, we 
give some analytical and numerical results. In section 6, we 
describe the numerical simulations performed to test some 
theoretical aspects and give the main results. Finally, in sec- 
tion 7, we summarize the main results and applications of 
this paper. 



2 LOCAL PEAK DETECTION 

Let us assume a 1D background (e.g. a one-dimensional scan on the celestial sphere or a time-ordered data set) represented by a Gaussian random field $\xi(x)$ with average value $\langle\xi(x)\rangle = 0$ and power spectrum $P(q)$, $q \equiv |Q|$: $\langle\xi(Q)\,\xi^*(Q')\rangle = P(q)\,\delta_D(Q - Q')$, where $\xi(Q)$ is the Fourier transform of $\xi(x)$ and $\delta_D$ is the 1D Dirac distribution. The distribution of maxima was studied by Rice (1954) in a pioneering article. The expected number density of maxima per intervals $(x, x + dx)$, $(\nu, \nu + d\nu)$ and $(\kappa, \kappa + d\kappa)$ is given by

$$n_b(\nu, \kappa) = \frac{n_b\,\kappa}{\sqrt{2\pi(1-\rho^2)}}\; e^{-\frac{\nu^2 + \kappa^2 - 2\rho\nu\kappa}{2(1-\rho^2)}}, \qquad (1)$$

where $\nu \equiv \xi/\sigma_0 \in (-\infty, \infty)$ and $\kappa \equiv -\xi''/\sigma_2 \in (0, \infty)$ represent the normalized field and curvature, respectively, and $n_b$ is the expected total number density of maxima (i.e. the number of maxima per unit interval $dx$),

$$n_b \equiv \frac{1}{2\pi\theta_m}, \qquad \theta_m \equiv \frac{\sigma_1}{\sigma_2}, \qquad \rho \equiv \frac{\sigma_1^2}{\sigma_0\sigma_2}.$$

$\sigma_n^2$ is the moment of order $2n$ associated with the field.

If the original field is linearly filtered with a circularly-symmetric filter $\Psi(x; R, b)$, dependent on two parameters ($R$ defines a scaling whereas $b$ defines a translation),

$$\Psi(x; R, b) \equiv \frac{1}{R}\,\psi\!\left(\frac{|x - b|}{R}\right),$$

we define the filtered field as

$$w(R, b) = \int dx\, \xi(x)\, \Psi(x; R, b).$$

Then, the moment of order $n$ of the linearly-filtered field is

$$\sigma_n^2 \equiv 2\int_0^\infty dq\, q^{2n}\, P(q)\, \psi^2(Rq), \qquad (4)$$

where $P(q)$ is the power spectrum of the unfiltered field and $\psi(Rq)$ is the Fourier transform of the circularly-symmetric linear filter.
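As a hedged illustration of eqn. (4), the moments can be evaluated numerically for any filter given in Fourier space. The sketch below assumes a pure power law $P(q) = q^{-\gamma}$ with unit normalization, a matched-filter shape $\psi(x) \propto x^\gamma e^{-x^2/2}$ for a Gaussian source, and the standard spectral-moment combinations $\rho = \sigma_1^2/(\sigma_0\sigma_2)$ and $\theta_m = \sigma_1/\sigma_2$. The helper names (`trapezoid`, `filtered_moment`, `rho_and_theta_m`, `matched_filter_shape`) are hypothetical and are not taken from the paper.

```python
import math
import numpy as np


def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def filtered_moment(n, psi, gamma, R=1.0, q_max=40.0, num=400_001):
    """sigma_n^2 = 2 * int_0^inf dq q^(2n) P(q) psi(R q)^2, assuming P(q) = q^(-gamma)."""
    q = np.linspace(1e-8, q_max, num)  # start just above q = 0 to avoid 0**negative
    integrand = q ** (2 * n - gamma) * psi(R * q) ** 2
    return 2.0 * trapezoid(integrand, q)


def rho_and_theta_m(psi, gamma, R=1.0):
    """Return (rho, theta_m) built from the first three filtered moments."""
    s0, s1, s2 = (filtered_moment(n, psi, gamma, R) for n in range(3))
    return s1 / math.sqrt(s0 * s2), math.sqrt(s1 / s2)


def matched_filter_shape(gamma):
    """Assumed matched-filter shape: x^gamma * exp(-x^2/2) / Gamma((1+gamma)/2)."""
    m = 0.5 * (1.0 + gamma)
    return lambda x: x ** gamma * np.exp(-0.5 * x ** 2) / math.gamma(m)


if __name__ == "__main__":
    gamma = 1.4
    rho, theta_m = rho_and_theta_m(matched_filter_shape(gamma), gamma)
    print(f"gamma={gamma}: rho={rho:.4f}, theta_m/R={theta_m:.4f}")
```

For this particular filter shape the integrals reduce to Gamma functions, so the numerical values can be compared with $\rho = \sqrt{m/(1+m)}$ and $\theta_m = R/\sqrt{1+m}$, $m = (1+\gamma)/2$.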

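Now, let us consider a Gaussian source (i.e. with profile given by $\tau(x) = e^{-x^2/2R^2}$) embedded in the previous background. Then, the expected number density of maxima per intervals $(x, x + dx)$, $(\nu, \nu + d\nu)$ and $(\kappa, \kappa + d\kappa)$, given a source of amplitude $A$ in such spatial interval, is given by

$$n(\nu, \kappa | \nu_s) = \frac{n_b\,\kappa}{\sqrt{2\pi(1-\rho^2)}}\; e^{-\frac{(\nu-\nu_s)^2 + (\kappa-\kappa_s)^2 - 2\rho(\nu-\nu_s)(\kappa-\kappa_s)}{2(1-\rho^2)}}, \qquad (5)$$

where $\nu \in (-\infty, \infty)$ and $\kappa \in (0, \infty)$, $\nu_s = A/\sigma_0$ is the normalized amplitude of the source and $\kappa_s = -A\tau_\psi''/\sigma_2$ is the normalized curvature of the filtered source. The last expression can be obtained from

$$-\tau_\psi'' = 2\int_0^\infty dq\, q^2\, \tau(q)\, \psi(Rq). \qquad (6)$$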



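We consider that the filter is normalized such that the amplitude of the source is the same after linear filtering: $\int dx\, \tau(x)\, \Psi(x; R, b) = 1$.

The total number density $n$ in the presence of a local source is obtained by integrating eqn. (5) over $\nu$ and $\kappa$:

$$n = n_b\, e^{-\kappa_s^2/2}\left[1 + B\!\left(\frac{\kappa_s}{\sqrt{2}}\right)\right], \qquad B(x) \equiv \sqrt{\pi}\, x\, e^{x^2}\, \mathrm{erfc}(-x). \qquad (7)$$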
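The closed form for the total density of maxima can be checked numerically. The sketch below assumes that eqn. (7) reads $n/n_b = e^{-\kappa_s^2/2}[1 + B(\kappa_s/\sqrt{2})]$, which is what one gets by integrating a shifted Rice density $\kappa\, e^{-(\kappa-\kappa_s)^2/2}$ over $\kappa > 0$ after marginalizing over $\nu$. The function names are hypothetical.

```python
import math


def B(x):
    """B(x) = sqrt(pi) * x * exp(x^2) * erfc(-x)."""
    return math.sqrt(math.pi) * x * math.exp(x * x) * math.erfc(-x)


def maxima_density_ratio(kappa_s):
    """Assumed closed form of n / n_b for a source of normalized curvature kappa_s."""
    return math.exp(-0.5 * kappa_s ** 2) * (1.0 + B(kappa_s / math.sqrt(2.0)))


def maxima_density_ratio_numeric(kappa_s, k_max=30.0, steps=300_000):
    """Midpoint-rule integral of kappa * exp(-(kappa - kappa_s)^2 / 2) over kappa > 0."""
    h = k_max / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += k * math.exp(-0.5 * (k - kappa_s) ** 2)
    return total * h
```

With no source ($\kappa_s = 0$) the ratio is exactly one, as it should be, since the background density of eqn. (1) then integrates to $n_b$.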



3 OPTIMAL STATISTICS 

We want to choose between filters on the basis of optimality. We will assume that optimality involves the following properties: (a) the confidence level of the detections, (b) the number of spurious sources that emerge in the process and (c) the number of real sources detected. As we will see, these properties are not only related to the SNR gained in the filtering process but also depend on the moments of the filtered field up to fourth order, the curvature of the source and the amplification (1D case).

We will distinguish two cases: (I) simple detection, consisting of detecting the presence of a signal $s(x)$ in a background, and (II) simple measurement, consisting of detecting the signal $s(x) = A\tau(x)$ and measuring its unknown amplitude $A$ (we assume the profile is given) in the presence of a background.



3.1 Simple detection 

First, let us establish the confidence level of the detections assuming simple detection (the amplitude $A$ of a source is given and so we can calculate $\nu_s$). Let us consider a local peak in the 1D data set characterised by the normalized amplitude and curvature $(\nu, \kappa)$. If $H_0$: p.d.f. $p(\nu, \kappa | \nu_s)$ represents the null hypothesis that the peak is a source with normalized amplitude $\nu_s$ given the data $(\nu, \kappa)$, and $H_1$: p.d.f. $p(\nu, \kappa | 0)$ represents the alternative hypothesis that the peak is a maximum of the background, we can associate with any region $R_*(\nu, \kappa)$ two errors:

$$\alpha = \int_{R_*} d\nu\, d\kappa\; p(\nu, \kappa | 0), \qquad 1 - \beta = \int_{R_*} d\nu\, d\kappa\; p(\nu, \kappa | \nu_s). \qquad (8)$$

$\alpha$ is the false alarm probability or confidence level of the detection (i.e. it represents the probability of interpreting noise as signal), whereas $\beta$ is the false dismissal probability or, equivalently, $1 - \beta$ is the power of the detection (i.e. $\beta$ represents the probability of interpreting signal as noise). $R_*$ is called the acceptance region.

Clearly, the previous probabilities can be obtained from the number density of maxima given by eqns. (1, 5, 7) as

$$p_b(\nu, \kappa) \equiv p(\nu, \kappa | 0) = \frac{n_b(\nu, \kappa)}{n_b}, \qquad p(\nu, \kappa | \nu_s) = \frac{n(\nu, \kappa | \nu_s)}{n}, \qquad (9)$$

where $n$ and $n_b$ are the total number density in the presence or absence of a local source, respectively.

We will assume as decision rule the Neyman-Pearson one (C1): the acceptance region $R_*$ giving the highest power $1 - \beta$, for a given confidence level $\alpha$, is the region

$$L(\nu, \kappa | \nu_s) \equiv \frac{p(\nu, \kappa | \nu_s)}{p(\nu, \kappa | 0)} \geq L_*, \qquad (10)$$

where $L_*$ is a constant. So, the decision rule is expressed by the likelihood ratio: if $L \geq L_*$ the signal is present, whereas if $L < L_*$ the signal is absent. Once we have assumed this decision rule for simple detection, one can calculate the ROC curves ('receiver operating characteristic' curves in the signal processing jargon): $\beta = \beta(\alpha, \nu_s)$.

Now, we will introduce the significance $s(\nu_s)$ of the detection,

$$s^2 \equiv \frac{\left[\langle N\rangle_{\rm signal} - \langle N\rangle_{\rm no-signal}\right]^2}{\sigma^2_{\rm signal} + \sigma^2_{\rm no-signal}}, \qquad (11)$$

where the numerator contains the difference between the mean number of peaks in the presence and absence of signal, $(1-\beta)N$ and $\alpha N$, respectively, and the denominator contains the variances in the presence and absence of signal, $\beta(1-\beta)N$ and $\alpha(1-\alpha)N$, respectively. These quantities have been calculated taking into account that, in the absence of signal, the probability of detecting local peaks in $q$ of $N$ realisations of the background is given by the binomial distribution $P_q = \binom{N}{q}\alpha^q(1-\alpha)^{N-q}$. Therefore, the significance is given by the function

$$s^2(\alpha) \propto \frac{(1 - \beta - \alpha)^2}{\beta(1-\beta) + \alpha(1-\alpha)}. \qquad (12)$$

The next step is to assume a criterion to define the optimal confidence level (C2): maximizing $s$ with respect to $\alpha$ (or equivalently $L_*$). In this sense, for each particular filter we are able to get the best conditions (i.e. the maximum power corresponding to the optimal confidence level in the sense of maximizing the significance of the detection, see Allen et al. 2002) for the filtered data to be analysed. Clearly, for each filter we find a unique region of acceptance $R_*$ given the amplitude $A$ of the source.
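Criterion C2 can be illustrated with a deliberately simplified ROC curve. The sketch below evaluates the right-hand side of eqn. (12) and scans a threshold for a toy problem in which the test statistic is a unit-variance Gaussian with mean 0 (noise) or $\nu_s$ (signal). This toy model is an assumption made for illustration only and is not the peak statistics of eqns. (1) and (5); the function names are hypothetical.

```python
import math


def significance_sq(alpha, beta):
    """Eqn (12) up to a constant: (1 - beta - alpha)^2 / [beta(1-beta) + alpha(1-alpha)]."""
    return (1.0 - beta - alpha) ** 2 / (beta * (1.0 - beta) + alpha * (1.0 - alpha))


def toy_roc(threshold, nu_s):
    """Toy ROC point: statistic ~ N(0,1) for noise and N(nu_s,1) for signal (assumption)."""
    alpha = 0.5 * math.erfc(threshold / math.sqrt(2.0))                # false alarm
    beta = 1.0 - 0.5 * math.erfc((threshold - nu_s) / math.sqrt(2.0))  # false dismissal
    return alpha, beta


def best_threshold(nu_s, lo=-2.0, hi=8.0, steps=2001):
    """Grid search for the threshold that maximizes the significance."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda u: significance_sq(*toy_roc(u, nu_s)))
```

In this symmetric toy problem the optimum sits at the threshold $\nu_s/2$, where the two Gaussian densities are equal, that is, where the likelihood ratio equals one.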



3.2 Simple measurement 

In this case the signal has an unknown parameter, the amplitude, that is measured. First, we are interested in the detection of the signal and then in the estimation of its amplitude. Therefore, we calculate the a posteriori p.d.f. of all the possible values of the parameter $A$ given the data $(\nu, \kappa)$. In the absence of any a priori information about the p.d.f. $p(A)$ we will assume uniformity in the interval $[0, \nu_c]$, i.e. we can integrate over $\nu_s$: $p(\nu, \kappa) = \int_0^{\nu_c} d\nu_s\, p(\nu, \kappa | \nu_s)$. With this new p.d.f., the likelihood ratio is defined as

$$L(\nu, \kappa) \equiv \int_0^{\nu_c} d\nu_s\, L(\nu, \kappa | \nu_s). \qquad (13)$$

We express the decision rule giving the region of acceptance $R_*$ as (criterion C1)

$$R_*: \; L(\nu, \kappa) \geq L_*, \qquad (14)$$

where $L_*$ is a constant. Once we have assumed this decision rule for simple measurement, one can calculate the ROC curves $\beta = \beta(\alpha)$, where one must use $p_b(\nu, \kappa)$ and $p(\nu, \kappa)$, respectively. By introducing the significance $s$ of the detection through eqn. (11), we adopt the same C2 criterion, maximizing $s$ with respect to $\alpha$, to obtain the optimal confidence level.

Moreover, the number of spurious sources $n_b^*$ (corresponding to background fluctuations) and of real detections $n^*$ are obtained by integration of $n_b(\nu, \kappa)$ and $n(\nu, \kappa) = \int_0^{\nu_c} d\nu_s\, n(\nu, \kappa | \nu_s)$ over the region of acceptance $R_*$.

On the other hand, regarding the estimation of the amplitude $A$, it is natural to take the most probable value of $A$ (i.e. the amplitude $A$ corresponding to the value of $\nu_s$ at which the likelihood $L(\nu, \kappa | \nu_s)$ takes its maximum) as the result of the measurement, and to consider this value as the measured value of $A$.



4 THE FILTERS 

4.1 The scale-adaptive filter (SAF) 

The idea of a scale-adaptive filter (or optimal pseudo-filter) has been recently introduced by the authors (Sanz et al. 2001). By introducing a circularly-symmetric filter, $\Psi(x; R, b)$, we are going to express the conditions required to obtain a scale-adaptive filter for the detection of the source $s(x)$ at the origin, taking into account the fact that the source is characterised by a single scale $R_0$. The following conditions are assumed: (1) $\langle w(R_0, 0)\rangle = s(0) \equiv A$, i.e. $w(R_0, 0)$ is an unbiased estimator of the amplitude of the source; (2) the variance of $w(R, b)$ has a minimum at the scale $R_0$, i.e. it is an efficient estimator; (3) $w(R, b)$ has a maximum at $(R_0, 0)$. Then, the filter satisfying these conditions is given by the equations (here $n = 1$ is the dimension of the data)

$$\tilde\psi(q) \equiv \psi(R_0 q) = \frac{1}{2}\,\frac{\tau(q)}{P(q)}\,\frac{1}{\Delta}\left[nb + c - (na + b)\frac{d\ln\tau}{d\ln q}\right], \qquad \Delta \equiv ac - b^2, \qquad (15)$$

$$a \equiv \int_0^\infty dq\,\frac{\tau^2}{P}, \qquad b \equiv \int_0^\infty dq\,\frac{\tau}{P}\frac{d\tau}{d\ln q}, \qquad c \equiv \int_0^\infty dq\,\frac{1}{P}\left(\frac{d\tau}{d\ln q}\right)^2, \qquad (16)$$



© 0000 RAS, MNRAS 000, 000-000 



4 Barreiro et al. 







Figure 1. Scale-adaptive filter (solid line), matched filter (dotted line) and Mexican Hat wavelet (dashed line) in Fourier space for the cases $\gamma = 0$, $\gamma = 1$ and $\gamma = 2$. The scale parameter $R$ of the filters is taken to be $R = 1$. Note that for $\gamma = 1$ the adaptive and matched filters coincide, whereas the matched filter and the Mexican Hat wavelet are equal for $\gamma = 2$.



where $\tau$ is the profile of the source in Fourier space ($s(x) = A\tau(x)$). Generically, $\Psi$ is not positive (hence the name pseudo-filter) and does not define a continuous wavelet transform. Moreover, the filter adapts to the source profile, the background and the scale of the source, hence the name adaptive filter.

The previous equations have been used by the authors 
to obtain the adaptive filter for a Gaussian and an expo- 
nential profile (Sanz et al. 2001) and a multiquadric profile 
(Herranz et al. 2002a,b). The previous SAF has been re- 
cently used by Herranz et al. (2002c) for point source de- 
tection and extraction from simulated Planck time-ordered 
data. 

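Assuming a scale-free power spectrum, $P(q) = Dq^{-\gamma}$, and a Gaussian profile for the source, the previous set of equations leads to the filter

$$\psi_a(q) = \frac{1}{\Gamma(m)}\, x^\gamma\, e^{-x^2/2}\left[1 - t + \frac{t}{m}\,x^2\right], \qquad x \equiv qR, \quad m \equiv \frac{1+\gamma}{2}, \quad t \equiv \frac{1-\gamma}{2}. \qquad (17)$$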

Fig. 1 shows the SAF for different values of the spectral index, $\gamma = 0, 1, 2$. In this case the filter parameters $\theta_m$ and $\rho$ and the curvature of the source $y_s$ are given by

$$\theta_m = R\sqrt{\frac{S_1}{(1+m)\,S_2}}, \qquad \rho = \sqrt{\frac{m}{1+m}}\,\frac{S_1}{\sqrt{S_0 S_2}}, \qquad y_s = \sqrt{\frac{S_0}{m(1+m)\,S_2}}, \qquad (18)$$

with $S_0 \equiv 1 + \frac{t^2}{m}$, $S_1 \equiv 1 + 2t + \frac{3t^2}{m} + \frac{2t^2}{m^2}$ and $S_2 \equiv 1 + 4t + \frac{5t^2}{m} + \frac{6t^2}{m^2}$.

Fig. 2 shows the parameters $\theta_m/R$, $\rho$ and $y_s$ as a function of the spectral index $\gamma$ for the SAF.

Figure 2. $\theta_m/R$, $\rho$ and $y_s$ as a function of the spectral index $\gamma$ for the SAF (solid line), MF (dotted line) and MHW (dashed line).

The identification of sources as peaks above a certain threshold (e.g. $5\sigma_0$) in filter space gives a low probability of false detections (reliability), because if the background has a characteristic scale of variation different from that of the sources, then practically everything detected with our method is real.

For comparison with the filter developed in this subsection, we shall briefly introduce two other filters that have been extensively used in the literature: the Mexican hat wavelet and the matched filter.



4.2 The Mexican Hat wavelet (MHW) 

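The MHW on $\mathbb{R}$ is defined to be proportional to the Laplacian of the Gaussian function: $\psi_h(x) \propto (1 - x^2)\,e^{-x^2/2}$. Thus, in Fourier space

$$\psi_h(q) = \frac{2}{\sqrt{\pi}}\, x^2\, e^{-x^2/2}, \qquad x \equiv qR. \qquad (19)$$

Fig. 1 shows the MHW compared to the other filters.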

In this case the filter parameters $\theta_m$ and $\rho$ and the curvature of the source $y_s$ are given by

$$\theta_m = \frac{R}{\sqrt{3+t}}, \qquad \rho = \sqrt{\frac{2+t}{3+t}}, \qquad y_s = \frac{3}{2\sqrt{(2+t)(3+t)}}. \qquad (20)$$

The parameters $\theta_m/R$, $\rho$ and $y_s$ are given as a function of the spectral index $\gamma$ in Fig. 2 for the MHW. We note that the generalization of this type of wavelet to two dimensions has been extensively used for point source detection in 2D images.

It should be noted that the MHW studied in this work is constructed at a fixed scale $R$ given by the profile of the source. This differs from the filter used in Vielva et al. (2001a,b), where an optimal scale is determined from the data and used to construct the MHW. This optimal scale is chosen to give maximum amplification of the source, and thus the MHW at the optimal scale (MHWo) will give a higher gain than the MHW at the scale of the source. Moreover, Vielva et al. perform a multiscale fit in order to estimate the amplitude of the source.



4.3 The matched Filter (MF) 

If one removes condition (3) defining the SAF in the previous section, it is not difficult to find another type of filter after minimization of the variance (condition (2)) with the constraint (1):

$$\tilde\psi(q) = \frac{1}{2a}\,\frac{\tau(q)}{P(q)}, \qquad a \equiv \int_0^\infty dq\,\frac{\tau^2}{P}. \qquad (21)$$

This will be called the matched filter, as is usual in the literature. Note that in general the matched and adaptive filters are different.

For the case of a Gaussian profile for the source and a scale-free power spectrum given by $P(q) \propto q^{-\gamma}$, the previous formula leads to the following matched filter:

$$\psi_m(q) = \frac{1}{\Gamma(m)}\, x^\gamma\, e^{-x^2/2}, \qquad x \equiv qR, \quad m \equiv \frac{1+\gamma}{2}. \qquad (22)$$

Fig. 1 shows the MF for different values of the spectral index, $\gamma = 0, 1, 2$. We remark that for $\gamma = 1$ the adaptive filter and the matched filter coincide, and that for $\gamma = 2$ the matched filter and the Mexican Hat wavelet are equal.
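The two coincidences noted above are easy to verify numerically. The sketch below writes the three filters in Fourier space as functions of $x = qR$, assuming the forms $\psi_m = x^\gamma e^{-x^2/2}/\Gamma(m)$, $\psi_a = \psi_m\,[1 - t + (t/m)x^2]$ and $\psi_h = (2/\sqrt{\pi})\,x^2 e^{-x^2/2}$ with $m = (1+\gamma)/2$ and $t = (1-\gamma)/2$. These expressions are a reading of eqns. (17), (19) and (22), and the function names are hypothetical.

```python
import math
import numpy as np


def psi_matched(x, gamma):
    """Assumed matched filter for a Gaussian source and P(q) ~ q^(-gamma)."""
    m = 0.5 * (1.0 + gamma)
    return x ** gamma * np.exp(-0.5 * x ** 2) / math.gamma(m)


def psi_scale_adaptive(x, gamma):
    """Assumed scale-adaptive filter: matched filter times [1 - t + (t/m) x^2]."""
    m = 0.5 * (1.0 + gamma)
    t = 0.5 * (1.0 - gamma)
    return psi_matched(x, gamma) * (1.0 - t + (t / m) * x ** 2)


def psi_mexican_hat(x):
    """Assumed Mexican Hat wavelet in Fourier space, independent of gamma."""
    return (2.0 / math.sqrt(math.pi)) * x ** 2 * np.exp(-0.5 * x ** 2)


if __name__ == "__main__":
    x = np.linspace(0.01, 5.0, 200)
    print(np.allclose(psi_scale_adaptive(x, 1.0), psi_matched(x, 1.0)))
    print(np.allclose(psi_matched(x, 2.0), psi_mexican_hat(x)))
```

The first coincidence follows from $t = 0$ at $\gamma = 1$, and the second from $\Gamma(3/2) = \sqrt{\pi}/2$ at $\gamma = 2$.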

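For the MF the parameters $\theta_m$ and $\rho$ and the curvature of the source $y_s$ are given by

$$\theta_m = \frac{R}{\sqrt{1+m}}, \qquad \rho = y_s = \sqrt{\frac{m}{1+m}}. \qquad (23)$$

Fig. 2 shows the parameters $\theta_m/R$, $\rho$ and $y_s$ as a function of the spectral index $\gamma$ for the MF.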



5 ANALYTICAL AND NUMERICAL RESULTS 

We will distinguish the two cases of simple detection and simple measurement in the applications that follow.



5.1 Simple detection 

In this case, we try to detect sources of known amplitude $A$. We note that the same amplitude $A$ translates into different thresholds $\nu_{s,i} = A/\sigma_i$ (where $i$ refers to the SAF, MF or MHW) in the filtered map for the three filters considered, since they have different amplifications. The relation between the values of $\nu_s$ for each filter can easily be obtained from their relative amplification, which is given in Fig. 3. In the following comparison, we consider the same source amplitude for all the filters and give its value in terms of the dispersion of the map filtered with the SAF.

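Using eqns. (9, 10) one obtains for the likelihood

$$L(\nu, \kappa | \nu_s) = \frac{\exp\left[\varphi\,\nu_s - \frac{(1-\rho y_s)^2}{2(1-\rho^2)}\,\nu_s^2\right]}{1 + B\!\left(\frac{\nu_s y_s}{\sqrt{2}}\right)}, \qquad (24)$$

$$\varphi \equiv \frac{(1-\rho y_s)\,\nu + (y_s - \rho)\,\kappa}{1-\rho^2}, \qquad y_s \equiv \frac{\kappa_s}{\nu_s}. \qquad (25)$$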




Figure 3. Relative amplification given by the MF (dotted line) and MHW (short-dashed line) with respect to the SAF (solid line) versus the spectral index $\gamma$. For comparison, the amplification given by the MHWo (long-dashed line) is also shown. The MF always gives a higher gain because it is constructed by imposing maximum amplification of the source.

Figure 4. Value of the significance versus $L_*$ and $\alpha$ for $\gamma = 0$ (top), 1.4 (middle) and 2.2 (bottom) for the simple detection case with a source amplitude corresponding to $\nu_{s,\rm SAF} = 3$. The SAF, MF and MHW are given by the solid, dotted and dashed lines respectively. Note that the maximum of the significance is obtained in all cases at $L_* \simeq 1$.



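The region of acceptance $R_*$ is defined by $L \geq L_*$ or, equivalently, $\varphi \geq \varphi_*(\nu_s)$ (we assume $\nu_s > 0$) with

$$\varphi_*(\nu_s) \equiv \frac{1}{\nu_s}\ln\left\{L_*\left[1 + B\!\left(\frac{\nu_s y_s}{\sqrt{2}}\right)\right]\right\} + \frac{(1-\rho y_s)^2}{2(1-\rho^2)}\,\nu_s. \qquad (26)$$

Clearly, the last constraint can be rewritten as

$$R_*: \; \nu \geq \nu_*(\kappa; \nu_s), \qquad \nu_* \equiv \frac{\rho - y_s}{1 - \rho y_s}\,\kappa + \frac{1-\rho^2}{1-\rho y_s}\,\varphi_*. \qquad (27)$$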



Figure 5. The curvature distribution, given a threshold $\nu = \nu_s$, of maxima of the background (thin lines) and of the background plus source (thick lines) filtered with the SAF, MF and MHW is shown in the top and bottom left panels. The amplitude of the source corresponds to a threshold $\nu_{s,\rm SAF} = 3$ and the spectral index of the background is $\gamma = 1.4$. The bottom right panel shows the acceptance region for the same case: those maxima with $\nu$ and $\kappa$ above the line are accepted as sources and those below are rejected. Solid, dotted and dashed lines correspond to the SAF, MF and MHW respectively.

Figure 7. Relative difference in the true to spurious ratio of the SAF (solid line) and MHW (dashed line) with respect to the MF (dotted line) in per cent. Top and bottom panels correspond to a source amplitude of $\nu_{s,\rm SAF} = 1$ and $\nu_{s,\rm SAF} = 5$ respectively. The acceptance region has been obtained in both cases using $L_* = 1$.

In order to find the acceptance region, we need to maximize the significance with respect to $\alpha$ (or equivalently $L_*$) for each filter. Fig. 4 shows the curve of the significance versus $\alpha$ and versus $L_*$ for the three filters with $\nu_{s,\rm SAF} = 3$ and $\gamma = 0, 1.4, 2.2$. It is remarkable that in all cases the maximum of the significance is found at $L_* \simeq 1$ within a few per cent. This is also true independently of the source amplitude (for the studied range of $\nu_{s,\rm SAF}$ from 1 to 5). Therefore the chosen criterion indicates that we are in the best conditions to discriminate between the two hypotheses by using a simple acceptance region: the candidates are accepted as detections if the probability of having background plus source is greater than or equal to that of having only background. The value of $\alpha$ that maximizes the significance depends mainly on the amplitude of the source and ranges from $\sim 0.3$ for $\nu_{s,\rm SAF} = 1$ to $\sim 0.01$ for the highest amplitude considered, $\nu_{s,\rm SAF} = 5$. This simply reflects the fact that the higher the amplitude of the source, the lower the number of spurious detections, and vice versa. Taking into account the previous results, we have calculated the acceptance region using $L_* = 1$ for all the cases studied in this section.
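The acceptance rule with $L_* = 1$ can be sketched in a few lines. The code below assumes that the likelihood ratio of eqn. (24) has the form $L = \exp[\varphi\nu_s - H\nu_s^2/2]/[1 + B(\nu_s y_s/\sqrt{2})]$ with $H = (1-\rho y_s)^2/(1-\rho^2)$ and $\varphi = [(1-\rho y_s)\nu + (y_s-\rho)\kappa]/(1-\rho^2)$, which is what follows from dividing the peak densities with and without a source. The function names and the parameter values in the usage lines are hypothetical.

```python
import math


def B(x):
    """B(x) = sqrt(pi) * x * exp(x^2) * erfc(-x), as in eqn (7)."""
    return math.sqrt(math.pi) * x * math.exp(x * x) * math.erfc(-x)


def phi(nu, kappa, rho, y_s):
    """Linear combination of amplitude and curvature entering the likelihood ratio."""
    return ((1.0 - rho * y_s) * nu + (y_s - rho) * kappa) / (1.0 - rho ** 2)


def likelihood_ratio(nu, kappa, nu_s, rho, y_s):
    """Assumed form of L(nu, kappa | nu_s) for a peak with data (nu, kappa)."""
    h = (1.0 - rho * y_s) ** 2 / (1.0 - rho ** 2)
    num = math.exp(phi(nu, kappa, rho, y_s) * nu_s - 0.5 * h * nu_s ** 2)
    return num / (1.0 + B(nu_s * y_s / math.sqrt(2.0)))


def nu_star(kappa, nu_s, rho, y_s, L_star=1.0):
    """Amplitude threshold nu_*(kappa) on the boundary L = L_star (needs rho * y_s < 1)."""
    h = (1.0 - rho * y_s) ** 2 / (1.0 - rho ** 2)
    phi_star = math.log(L_star * (1.0 + B(nu_s * y_s / math.sqrt(2.0)))) / nu_s + 0.5 * h * nu_s
    return ((rho - y_s) * kappa + (1.0 - rho ** 2) * phi_star) / (1.0 - rho * y_s)


if __name__ == "__main__":
    rho, y_s, nu_s = 0.7, 0.9, 3.0   # illustrative values only
    for kappa in (0.5, 1.0, 2.0):
        print(kappa, nu_star(kappa, nu_s, rho, y_s))
```

A peak is accepted when its amplitude exceeds `nu_star` at its measured curvature. When $y_s > \rho$ the threshold decreases with curvature, when $y_s < \rho$ it increases, and when $y_s = \rho$ it is flat.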

In the bottom right panel of Fig. 5 the acceptance region for the case $\gamma = 1.4$ and $\nu_{s,\rm SAF} = 3$ is given for the different filters. Those maxima with curvature and amplitude above the line are accepted as point sources, while those below are rejected. The slopes of the lines are easily understood by looking at the other three panels of the figure, which show the curvature distribution, given a threshold $\nu = \nu_s$, of maxima of the filtered background (thin line) and of the background plus source (thick line). For the MF both distributions are identical, i.e. we cannot differentiate between true and spurious detections based on their curvature. Therefore, the acceptance region is fixed by the amplification of the filter: if the maximum is found above a certain threshold it is accepted, otherwise it is rejected. For the SAF, the distribution of curvature is shifted to higher values when a source is present. Thus, maxima with larger curvature are accepted at lower thresholds than those with smaller curvature, since the latter are more likely to be produced by the background. By contrast, when the field is filtered with the MHW, the maxima produced by the source tend to have a smaller curvature than those of the background, and the slope of the curve changes with respect to the SAF case.
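The behaviour of the MF described above can be traced to the identity $y_s = \rho$ for this filter. A short derivation follows, assuming the matched filter $\psi = \tau/(2aP)$ with $a = \int_0^\infty dq\,\tau^2/P$, the moments $\sigma_n^2 = 2\int_0^\infty dq\,q^{2n}P\psi^2$ of eqn. (4), and the Fourier-space normalization $2\int_0^\infty dq\,\tau\psi = 1$ (a convention assumed here for consistency with eqn. (4)):

```latex
\begin{align}
\sigma_n^2 &= 2\int_0^\infty dq\, q^{2n} P\,\psi^2
            = \frac{1}{2a}\; 2\int_0^\infty dq\, q^{2n}\,\tau\,\psi , \\
\sigma_0^2 &= \frac{1}{2a}, \qquad
-\tau_\psi'' = 2\int_0^\infty dq\, q^2\,\tau\,\psi = 2a\,\sigma_1^2 , \\
y_s &\equiv \frac{\kappa_s}{\nu_s} = -\tau_\psi''\,\frac{\sigma_0}{\sigma_2}
     = 2a\,\sigma_0^2\;\frac{\sigma_1^2}{\sigma_0\,\sigma_2} = \rho .
\end{align}
```

With $y_s = \rho$, the coefficient of $\kappa$ in the logarithm of the likelihood ratio, which is proportional to $(y_s - \rho)$, vanishes, so the test depends on $\nu$ alone and the acceptance boundary is a constant amplitude threshold.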

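Once $R_*$ has been obtained, we can calculate the number density of spurious sources (corresponding to background fluctuations) $n_b^*$ and of real detections $n^*$ by integrating eqns. (1, 5) over the region of acceptance:

$$n_b^* = \frac{n_b}{2}\left\{\mathrm{erfc}\left[\frac{\sqrt{1-\rho^2}\,\varphi_*}{\sqrt{2}\,(1-\rho y_s)}\right] + \frac{\sqrt{1-\rho^2}\;y_s}{\sqrt{1 - 2\rho y_s + y_s^2}}\; e^{-\frac{(1-\rho^2)\,\varphi_*^2}{2(1 - 2\rho y_s + y_s^2)}}\; \mathrm{erfc}\left[-\frac{(1-\rho^2)\,y_s\,\varphi_*}{\sqrt{2}\,(1-\rho y_s)\sqrt{1 - 2\rho y_s + y_s^2}}\right]\right\}, \qquad (28)$$

$$n^* = \frac{n_b}{2}\int_0^\infty d\kappa\; \kappa\; e^{-\frac{1}{2}(\kappa - \nu_s y_s)^2}\; \mathrm{erfc}\left[\frac{\sqrt{1-\rho^2}}{\sqrt{2}\,(1-\rho y_s)}\left(\varphi_* - y_s\kappa - \frac{(1-\rho y_s)^2}{1-\rho^2}\,\nu_s\right)\right]. \qquad (29)$$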



The number densities $n_b^*$ and $n^*$, the ratio $r = n^*/n_b^*$, $\alpha$ and $1 - \beta$ have been plotted in Fig. 6 versus the spectral index $\gamma$ for a source amplitude corresponding to $\nu_{s,\rm SAF} = 3$ for the SAF, MF and MHW. In order to compare the performance of the different filters, we have also plotted the relative difference $D$ of the detection ratio with respect to the MF, which is defined as

$$D \equiv \frac{r_i - r_{\rm MF}}{r_{\rm MF}} \times 100, \qquad (30)$$

where the subindex $i$ refers to the different filters. Therefore a positive value of $D$ indicates that the corresponding filter has a better detection ratio than the MF. Three different regions can be seen in relation to the performance of the filters. For $\gamma \sim 0 - 1$ the MF clearly outperforms the SAF and the MHW. The SAF works better than the other two filters in the range $\gamma \sim 1 - 1.6$. Finally, at the highest values of $\gamma$ the MHW has the best performance of the three filters considered, although it is only slightly better than the MF. This behaviour is qualitatively similar for other source amplitudes, although the performance of the MF in comparison with the other filters improves for larger amplitudes and gets worse for smaller ones (see Fig. 7).

Figure 6. Simple detection results for a source amplitude corresponding to $\nu_{s,\rm SAF} = 3$ and an acceptance region obtained by fixing $L_* = 1$. Solid, dotted and short-dashed lines refer to the SAF, MF and MHW respectively. The different panels correspond to the number density of detections $n^*$ (top left), the number density of spurious sources $n_b^*$ (top right), the ratio $r$ of true to spurious detections (middle left), the relative difference $D$ in the ratio with respect to the MF in per cent (middle right), the probability $1 - \beta$ of correctly identifying a detection, given a maximum at the position of the source (bottom left), and the probability $\alpha$ of misidentifying a peak of the background as a source (bottom right). For comparison, the relative difference in the ratio obtained for the MHWo (long-dashed line) is also given in the middle right panel.

Although a detailed comparison with the MHWo is beyond the scope of this work, we have also plotted the detection ratio relative to the MF obtained by the MHWo for the case $\nu_{s,\rm SAF} = 3$ using the region of acceptance defined by $L \geq 1$ (see middle right panel of Fig. 6). The MHWo clearly outperforms the MHW at the scale of the source in the considered range of spectral indices. In fact, this filter gives the best detection ratio for high values of $\gamma$.

5.2 Simple measurement 

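In this case, the likelihood ratio is given by

$$L(\nu, \kappa) = \int_0^{\nu_c} d\nu_s\; \frac{\exp\left[\varphi\,\nu_s - \frac{(1-\rho y_s)^2}{2(1-\rho^2)}\,\nu_s^2\right]}{1 + B\!\left(\frac{\nu_s y_s}{\sqrt{2}}\right)}. \qquad (31)$$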

As in the previous case, we need to maximize the significance with respect to $L_*$ in order to get the acceptance region $L \geq L_*$. The values of $L_*$ that maximize the significance are plotted versus the spectral index in Fig. 8 for $\nu_{c,\rm SAF} = 5$. In this case, the values of $L_*$ are in the range $\sim 1.5$ to 1.8, depending on the filter and spectral index. They also depend on the assumed value of $\nu_c$: higher values of this parameter lead to larger values of $L_*$. This has an effect on the values of $L_*$ for the different filters, since the same amplitude of the source translates into different thresholds in the filtered maps and therefore we have a different $\nu_c$ for each filter. Thus the relative differences in the value of $L_*$ between filters seen in Fig. 8 are closely related to the relative amplification (see Fig. 3). The corresponding acceptance regions for $\gamma = 0, 1.4, 2.2$ are given in Fig. 9.

Figure 8. Value of $L_*$ that maximizes the significance versus the spectral index for the simple measurement case. Solid, dotted and dashed lines correspond to the SAF, MF and MHW respectively.

Figure 9. Acceptance region for the simple measurement case for three different spectral indices: 0 (top), 1.4 (middle) and 2.2 (bottom). Different lines correspond to the three filters considered: SAF (solid), MF (dotted) and MHW (dashed).

The number density of spurious sources (corresponding to background fluctuations) $n_b^*$ and of real detections $n^*$ are obtained by integration of $n_b(\nu, \kappa)$ and $n(\nu, \kappa) = \int_0^{\nu_c} d\nu_s\, n(\nu, \kappa | \nu_s)$, respectively, over the region of acceptance $R_*$.

The quantities $n_b^*$, $n^*$, $n^*/n_b^*$, $D$, $\alpha$ and $1 - \beta$ are given in Fig. 10 as a function of the spectral index. The acceptance region has been obtained using the value of $L_*$ (given in Fig. 8) that maximizes the significance in each case. We see that the situation is qualitatively similar to the simple detection case regarding the relative difference in the detection ratio. Again the MF outperforms the other filters at low $\gamma$, the SAF has the best behaviour at intermediate values, and the MHW gives a higher ratio at large $\gamma$.

On the other hand, a maximization of the likelihood ratio $L(\nu, \kappa | \nu_s)$, given by eqn. (24), with respect to $\nu_s$ allows an estimate of the amplitude of the source $\nu_s$ to be obtained as a function of the data $(\nu, \kappa)$. The estimate of the amplitude is then given as the solution of the equation

$$\left[y_s^2 + \frac{(1-\rho y_s)^2}{1-\rho^2}\right]\nu_s - \varphi + \frac{1}{\nu_s}\,\frac{B\!\left(\frac{\nu_s y_s}{\sqrt{2}}\right)}{1 + B\!\left(\frac{\nu_s y_s}{\sqrt{2}}\right)} = 0, \qquad (32)$$

where the function $B$ is given by eqn. (7) and $\varphi$ by eqn. (25). A confidence level for the estimate of the amplitude of the source can also be obtained from eqn. (24).
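The amplitude estimate can be sketched as a one-dimensional search over $\nu_s$. The code below assumes the same form of the likelihood ratio as eqn. (24), $\ln L = \varphi\nu_s - H\nu_s^2/2 - \ln[1 + B(\nu_s y_s/\sqrt{2})]$ with $H = (1-\rho y_s)^2/(1-\rho^2)$, and checks that the grid maximum satisfies the stationarity condition obtained by differentiating it. The function names, the grid search and the parameter values are illustrative choices, not the authors' procedure.

```python
import math


def B(x):
    """B(x) = sqrt(pi) * x * exp(x^2) * erfc(-x), as in eqn (7)."""
    return math.sqrt(math.pi) * x * math.exp(x * x) * math.erfc(-x)


def phi(nu, kappa, rho, y_s):
    """Combination of the data (nu, kappa) that enters the likelihood ratio."""
    return ((1.0 - rho * y_s) * nu + (y_s - rho) * kappa) / (1.0 - rho ** 2)


def log_likelihood(nu_s, phi_value, rho, y_s):
    """Assumed log-likelihood ratio as a function of the trial amplitude nu_s > 0."""
    h = (1.0 - rho * y_s) ** 2 / (1.0 - rho ** 2)
    return phi_value * nu_s - 0.5 * h * nu_s ** 2 - math.log1p(B(nu_s * y_s / math.sqrt(2.0)))


def estimate_nu_s(nu, kappa, rho, y_s, nu_max=10.0, steps=100_000):
    """Grid maximum of the log-likelihood over 0 < nu_s <= nu_max."""
    p = phi(nu, kappa, rho, y_s)
    grid = (nu_max * (i + 1) / steps for i in range(steps))
    return max(grid, key=lambda v: log_likelihood(v, p, rho, y_s))


def stationarity_residual(nu_s, nu, kappa, rho, y_s):
    """Left-hand side of the stationarity condition; near zero at an interior maximum."""
    h = (1.0 - rho * y_s) ** 2 / (1.0 - rho ** 2)
    b = B(nu_s * y_s / math.sqrt(2.0))
    return (y_s ** 2 + h) * nu_s - phi(nu, kappa, rho, y_s) + b / (nu_s * (1.0 + b))
```

A root finder applied to `stationarity_residual` would serve equally well; the grid search is used here only because it makes no assumption about the number of roots.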



6 NUMERICAL SIMULATIONS: RESULTS 

Let us consider how the ideas presented in the previous sections
apply to real cases. In order to do this, we simulated a set of
one-dimensional 'images' containing a Gaussian background
characterised by a power spectrum $P(q) \propto q^{-\gamma}$. Two
different experiments have been carried out. In the first set of
simulations, Gaussian sources of known width and amplitude were
introduced. This case corresponds to the simple detection scheme
presented in section 3.1, in which the knowledge of the amplitude of
the source (or, equivalently, its normalized amplitude $\nu_s$) can be
used to determine the region of acceptance. In the second set of
simulations, the Gaussian sources had amplitudes drawn from a uniform
distribution between a minimum and a maximum value. These simulations
were used to test the simple measurement scenario, in which the
question is not only whether the source is detected or not, but also
whether its amplitude can be estimated.
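
A minimal sketch of how such one-dimensional 'images' could be generated is given below. It is not the code used for the results of this section: the function names, the use of NumPy, the removal of the $q = 0$ mode and the normalization of each realisation to unit dispersion are all assumptions made here for illustration.

```python
import numpy as np

def simulate_background(n_pix, gamma, rng):
    """Gaussian random field with P(q) proportional to q**(-gamma).

    White Gaussian noise is multiplied in Fourier space by sqrt(P(q)).
    The q = 0 mode is dropped (assumption), so the field has zero mean.
    """
    q = 2.0 * np.pi * np.fft.rfftfreq(n_pix)      # non-negative wavenumbers
    amp = np.zeros_like(q)
    amp[1:] = q[1:] ** (-gamma / 2.0)             # sqrt of the power spectrum
    white = rng.standard_normal(n_pix)
    field = np.fft.irfft(np.fft.rfft(white) * amp, n=n_pix)
    return field / field.std()                    # unit dispersion (assumption)

def gaussian_source(n_pix, fwhm, amplitude):
    """Gaussian profile of given FWHM (in pixels) centred on pixel n_pix // 2."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    x = np.arange(n_pix) - n_pix // 2
    return amplitude * np.exp(-x**2 / (2.0 * sigma**2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = simulate_background(1024, 1.4, rng) + gaussian_source(1024, 5.0, 3.0)
    print(image.shape)
```

Shaping white noise in Fourier space is only one of several ways to draw a field with a prescribed power spectrum; it is chosen here because it keeps the example short.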

Figure 10. Results for the simple measurement case. The acceptance region has been obtained maximizing the significance with respect
to $L_*$. The different lines and panels are the same as those in figure 6.

6.1 Simulations in the simple detection scenario

In order to test the performance of different filters in the
simple detection scenario, we simulated a set of 'images' with
$N = 1024$ pixels each and a Gaussian background characterised by a
power spectrum $P(q) \propto q^{-\gamma}$. For each simulated image, a
source with a Gaussian profile of width FWHM = 5 pixels and known
amplitude $A_0$ was placed at the central pixel. Then, the image was
filtered with a matched filter, a scale-adaptive filter and a Mexican
Hat wavelet. The matched and scale-adaptive filters were chosen
taking into account the source width and the background index as in
eqns. (15) and (21), whereas the Mexican Hat had the same width as
the Gaussian profile of the source. The normalized amplitude of the
filtered source is $\nu_{s,X} = A_0/\sigma_X$, where X refers to the
filter under consideration (SAF, MF or MHW). After filtering, we look
for a maximum at the position of the source we introduced. Here we
implicitly assume that we know the exact position of the source,
which is true for our simulations but will not be the case in a
realistic observation. Fixing the position of the source allows us to
clearly illustrate the behaviour of the optimal statistics described
above. A short comment on position uncertainties is presented in
section 6.1.4. If such a maximum exists, we measure its normalized
amplitude $\nu$ and curvature $\kappa$. With these quantities and the
previously known $\nu_s$ it is easy to determine whether the maximum
lies inside a certain region of acceptance $R_*$ or not. We adopt the
criterion CI introduced in section 3.1, $R_*: L(\nu, \kappa) > L_*$,
and choose $L_* = 1$ (for a justification of this particular value
see section 5). If the maximum is in the region of acceptance, we
count it as a valid detection and, if not, as a rejection (a false
dismissal). The number of rejections is directly related to the
probability $\beta$ of eqn. (8).
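
The single-pixel measurement described above can be sketched as follows for the Mexican Hat case. This is an illustrative sketch only. It assumes the usual one-dimensional Mexican Hat kernel $(1 - x^2/R^2)\,e^{-x^2/2R^2}$ with arbitrary normalization, and it assumes that the normalized amplitude and curvature are the filtered value and minus its second derivative divided by the dispersions `sigma0` and `sigma2` of the filtered background and of its second derivative. The function names are hypothetical, and the second derivative is approximated by a finite difference.

```python
import numpy as np

def mexican_hat_filter(data, scale):
    """Circular convolution of `data` with a 1D Mexican Hat of width `scale` (pixels)."""
    n = data.size
    x = np.arange(n) - n // 2
    psi = (1.0 - (x / scale) ** 2) * np.exp(-x**2 / (2.0 * scale**2))
    psi = np.fft.ifftshift(psi)                    # move the kernel centre to index 0
    return np.fft.irfft(np.fft.rfft(data) * np.fft.rfft(psi), n=n)

def peak_at(filtered, i, sigma0, sigma2):
    """Return (nu, kappa) if pixel i is a local maximum, otherwise None."""
    if not (filtered[i] > filtered[i - 1] and filtered[i] > filtered[i + 1]):
        return None
    second = filtered[i + 1] - 2.0 * filtered[i] + filtered[i - 1]
    return filtered[i] / sigma0, -second / sigma2

if __name__ == "__main__":
    n = 1024
    x = np.arange(n) - n // 2
    width = 5.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # Gaussian sigma for FWHM = 5
    source = np.exp(-x**2 / (2.0 * width**2))
    filtered = mexican_hat_filter(source, width)
    print(peak_at(filtered, n // 2, 1.0, 1.0))
```

In practice `sigma0` and `sigma2` would be estimated from filtered background-only simulations; the values 1.0 used in the usage lines are placeholders.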

On the other hand, let us consider a simulation with the same
background, but without any source, and filter it in the same way as
before. Now, let us repeat the same procedure as before: we look for
a local maximum at the central pixel of the 'image' and, in case it
exists, we apply the selection mechanism, that is, we make the
(wrong) hypothesis that a source of normalized amplitude $\nu_s$ is
present in the simulation, and determine the region of acceptance as
before. If the maximum is not inside the region of acceptance, it has
been safely rejected. On the contrary, if it lies inside the region
of acceptance, the noise will be interpreted as a signal and a false
detection will occur. The number of false detections is directly
related to the probability $\alpha$ of eqn. (8).



In order to test the behaviour of the probabilities of detecting a
signal, rejecting a signal and obtaining a false detection as
functions of the source amplitude $A_0$ and the filter, we tested 10
different values of $A_0$. Three different background regimes were
tested: background index $\gamma = 0$, corresponding to a case in
which the MF is expected to detect with better reliability than the
other two filters; $\gamma = 1.4$, corresponding to a case in which
the SAF is the most reliable; and $\gamma = 2.2$, where the most
reliable filter is the MHW (see section 5). The lower and higher
amplitude limits $A_0^{\min}$ and $A_0^{\max}$ were chosen so that
after filtering with the SAF the normalized amplitudes are
$\nu_{s,\mathrm{SAF}} = 1$ and $\nu_{s,\mathrm{SAF}} = 5$. For each
value of $A_0$ we performed 10000 simulations containing a source in
the central pixel and another 10000 simulations containing only
background. Each simulation was filtered using the SAF, the MF and
the MHW. For each pair of filtered simulations (one with source and
one without it) the following quantities were recorded: the number of
times that a source was properly detected, $N_d$; the number of times
a source produced a maximum that was rejected by the selection
criterion, $N_r$; and the number of spurious detections (due to the
background) that occurred in the simulations without source, $N_b$.
The total number of maxima found when a source is present is
$N_t = N_d + N_r$. Another interesting quantity is the ratio
$r = N_d/N_b$, which gives us an idea of the practical reliability of
the detection.

6.1.1 The case $\gamma = 0$

The results of the simulations with $\gamma = 0$ are shown in
figure 11. The results of the simulations, indicated with points, are
compared with the theoretical curves obtained in section 5. Solid
lines and filled circles refer to the scale-adaptive filter. Dotted
lines and open circles refer to the matched filter. Dashed lines and
asterisks refer to the Mexican Hat wavelet.

The top left panel of figure 11 shows the number of detections
$N_d$ (local maxima due to sources that are inside the region of
acceptance) in the 10000 simulations for the three filters. $N_d$ is
directly related to the number density of maxima $n^*$; the
relationship between the two quantities depends on the sizes of the
pixel and the filter scale $R$. Since the matched filter has the
greatest gain, it gives the highest number of detections. The number
of detections found in the simulations is lower than the expected
value. This result is not unexpected, for the following reason: the
theoretical expected number of detections is calculated by
multiplying the number density of maxima in the presence of a local
source, $n^*$, by the number of simulations (10000 in this case) and
the pixel size (in units of the filter width $R$). This product can,
eventually, give more than one theoretical detection per simulation
and pixel, if the pixel size and $n^*$ are big enough. Obviously, in
a real simulation one can only find zero or one maximum per pixel.
This effect is more conspicuous for high $\nu_s$ and, as we will see
later, for higher $\gamma$ values, producing apparent paradoxes such
as the prediction, in some cases, of a number of detections greater
than the number of simulations we have. Therefore, the theoretical
curve must be considered at best as an upper limit for the real
number of detections.
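
The saturation argument can be written compactly. In the sketch below, $N_{\rm sim}$ (the number of simulations) and $\Delta$ (the pixel size in units of the filter scale) are labels introduced here for illustration only; they are not notation taken from the previous sections.

```latex
N_d^{\rm th} = n^{*} \, N_{\rm sim} \, \Delta ,
\qquad
N_d^{\rm sim} \le N_{\rm sim}
\quad \Longrightarrow \quad
N_d^{\rm th} > N_d^{\rm sim}
\ \ \text{whenever} \ \ n^{*} \Delta > 1 .
```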

The top right panel of figure 11 shows the number of spurious
detections $N_b$ found in the 10000 simulations versus the expected
value. Again, $N_b$ is directly proportional to the number density of
maxima due to the background, $n_b^*$. The dots were obtained by
counting the number of accepted background maxima over all the pixels
in the absence of a local source and then computing the average
number of spurious detections per pixel, together with the
statistical $1\sigma$ error bars of that average. The agreement
between the expected values and the results from the simulations is
remarkable. The matched filter gives the lowest gross number of
spurious detections.

The lower left and lower right panels in figure 11 show the ratio
between the number of background maxima interpreted as sources
($N_b$) and the total number of maxima due to the background, and the
ratio between source detections ($N_d$) and the total number of
maxima in the presence of a source ($N_t = N_d + N_r$), respectively.
The first ratio gives us an estimate of the false alarm probability
$\alpha$, whereas the second one gives the power of the detection
$1-\beta$. The results obtained with the SAF and the MF are
strikingly similar to the expected values, whereas the concordance is
slightly worse in the case of the MHW. Note that the problem of the
theoretical curves mentioned above does not affect these ratios,
because the terms that come from the number of simulations and the
pixel area cancel in the division.

More interesting is the ratio between the number of effective
detections and the number of spurious ones. This quantity is shown in
the middle left panel of figure 11. Since the proportionality between
$N_d$ and $n^*$ is the same as the one between $N_b$ and $n_b^*$,
this ratio is a good estimator of the ratio $r = n^*/n_b^*$. The
inverse of this ratio gives the percentage of spurious detections
found among a given number of detected maxima. Again, the agreement
between theory and simulations is excellent. Note that the greatest
values of $r = N_d/N_b$ correspond to the matched filter. This means
that, if what we are looking for is reliability in the detection, the
matched filter is the best choice in this case, $\gamma = 0$.
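
The bookkeeping behind these three ratios is simple enough to sketch. The function below is illustrative only: its name, its argument names and the counts in the usage lines are hypothetical, and `n_bmax` stands for the total number of background maxima found in the source-free simulations.

```python
def detection_summary(n_d, n_r, n_b, n_bmax):
    """Empirical estimates from the recorded counts.

    n_d    : sources detected (maxima inside the acceptance region)
    n_r    : source maxima rejected by the selection criterion
    n_b    : background maxima accepted (spurious detections)
    n_bmax : all background maxima found without source
    """
    n_t = n_d + n_r                                   # maxima found with a source present
    return {
        "alpha": n_b / n_bmax,                        # false alarm probability
        "power": n_d / n_t,                           # 1 - beta
        "r": n_d / n_b if n_b > 0 else float("inf"),  # true / spurious detections
    }

if __name__ == "__main__":
    # Hypothetical counts, not taken from the simulations of this paper.
    print(detection_summary(8000, 2000, 100, 5000))
```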

To quantify the differences between filters regarding reliability
(in the sense mentioned above), it is useful to consider the quantity
$D$ defined in equation (30), which can be understood as the per cent
relative difference between filters. The middle right panel of
figure 11 shows the value of $D$. The differences between the MF and
the other filters are $\sim 50\%$ for the SAF and $\sim 80\%$ for the
MHW at intermediate $\nu_s$ values.

Gaussian backgrounds with $\gamma = 0$ correspond to uncorrelated
(white) noise and are very common in many fields of physics. For
instance, the background of many astronomical images is dominated by
uncorrelated Poissonian noise, which is well approximated by this
kind of noise when averaging over many observations.



6.1.2 The case $\gamma = 1.4$

The results of the simulations with $\gamma = 1.4$ are shown in
figure 12. As in the previous case, the results from the simulations
are compared with the theoretical curves obtained in section 5. The
different lines and panels are the same as those in figure 11. The
agreement between the theoretical expectations and the results is
very good.

The number of detections $N_d$ at a given normalized amplitude
$\nu_s$ is higher for the three filters than in the case
$\gamma = 0$. The filtering is more efficient at enhancing sources
when $\gamma$ increases because at high $\gamma$ the power of the
background concentrates in large-scale structures that are more
easily removed by the filters.

In this case the performance of the three filters is similar in
number of detections as well as in the probabilities $\alpha$ and
$1-\beta$. The SAF, however, produces a lower number of spurious
detections due to the background. Therefore, the ratio $r$ between
authentic and spurious detections is better for the SAF (although the
MF performs almost as well in this respect). The relative difference
$D$ between SAF and MF in this case takes values around 10%. The MHW
performs significantly worse than the other two filters in this case.

We remark that noise indices in the interval $1 < \gamma < 1.6$
are not rare in some areas of Astronomy. For example, in the scanning
of the sky by many CMB experiments (MAP, Planck, etc.), backgrounds
with noise indices in this range may appear due to the combination of
CMB and Galactic foregrounds with the scanning 1/f noise.

Figure 11. Simple detection results for the case $\gamma = 0$. Solid lines and filled circles refer to results obtained with the
scale-adaptive filter. Dotted lines and open circles refer to results obtained with the matched filter. Dashed lines and asterisks
refer to results obtained with the Mexican Hat wavelet. The different panels are explained in the text.



6.1.3 The case $\gamma = 2.2$

The results of the simulations with $\gamma = 2.2$ are shown in
figure 13. As in the two previous cases, the results from the
simulations are compared with the expectations obtained in section 5.
The different lines and panels are the same as those in figure 11.

This case corresponds with the region in which the 
MHW outperforms the other two in reliability. The matched 
filter, however, performs almost as well as the MHW. The 
relative differences D in the detection ratio are of a few per- 
cent between the MHW and the MF. The SAF performs 
significantly worse in this case. 

The agreement between the results of the simulations and the
theoretical predictions is good again, though not as good as in the
previous cases. The reason is that in practice it is not possible to
simulate perfectly a $P(q) \propto q^{-\gamma}$ background with
$\gamma > 0$, because such power spectra diverge when $q$ tends to 0.
The discrepancy between the ideal power spectrum and the simulated
one gets worse when $\gamma$ increases.
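
A short worked integral makes the role of the low-$q$ cut explicit. The sketch below assumes that the spectrum is sampled only between a minimum and a maximum wavenumber, $q_{\min}$ and $q_{\max}$ (labels introduced here), as happens in any simulation of finite length and finite pixel size.

```latex
\sigma^2 \propto \int_{q_{\min}}^{q_{\max}} q^{-\gamma}\, dq =
\begin{cases}
\dfrac{q_{\max}^{\,1-\gamma} - q_{\min}^{\,1-\gamma}}{1-\gamma} , & \gamma \neq 1 , \\[2ex]
\ln\!\left( q_{\max} / q_{\min} \right) , & \gamma = 1 .
\end{cases}
```

For $\gamma \geq 1$ this expression grows without bound as $q_{\min} \to 0$, so the simulated background depends increasingly on the lowest wavenumber that the finite image can represent.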

Spectral indices around 2 correspond to smooth backgrounds and
are less common than smaller indices. However, they appear in certain
cases. For instance, microwave observations dominated by dust
emission have $\gamma \sim 2$.



6.1.4 Determination of the position 

In the simulations presented above, our aim was to reproduce the
theoretical scheme. Therefore, we have focused only on what happens
in one pixel of the image, the pixel in which the source is located.
However, in most real situations, the position of the source is not
known and we will need to consider all the pixels of the image.

Note that we are assuming that the position of the maximum is a
good estimator of the position of the source. In practice, this is
not necessarily true. In particular, the maximum can be shifted from
the position of the source due to background fluctuations, although
the probability of finding a peak due to source plus background
decreases quickly as the distance to the source position increases.
We have calculated the probability of a peak appearing at a certain
distance $d_p$ (in pixel units) from the position of the simulated
source for the case $\gamma = 1.4$. We expect the results to be
qualitatively similar for the other cases. At $\nu_s = 5.0$ we found
that the probability of the peak appearing at distance $d_p = 1$ is
approximately 0.2 for the three considered filters. At $\nu_s = 3.0$
this probability increases to 0.4, whereas at $\nu_s = 1.0$ it
increases to 0.5. For $d_p = 2$ pixels we found these probabilities
to be practically zero for the cases $\nu_s = 5$ and $\nu_s = 3$, and
0.2 for $\nu_s = 1$. For $d_p \ge 3$ the probability is almost zero
in all cases. Therefore, in a realistic situation where we need to
consider all the pixels of the image, and in particular the
neighbouring pixels of the source, the number of detections will
slightly increase (in a very similar manner for all the considered
filters). The number of spurious sources does not depend on the pixel
position, so the spurious detections will increase linearly with the
number of pixels considered. This behaviour is again the same for all
the filters. Therefore the main conclusion is that, while the global
performance of the filters will be worse in a realistic case in which
the position of the sources is not known, the relative performance of
the filters will still be the same as that described in the previous
sections.

Figure 12. Simple detection results for the case $\gamma = 1.4$. The different lines and panels are the same as those in figure 11.

Note that in our simulations the FWHM of the beam is 5 times the
pixel size and that basically all maxima produced by a point source
fall within the beam size. Therefore, by simply using the position of
the maximum as an estimator of the position of the source, this last
quantity can be determined with a precision given by the beam size of
the experiment. More elaborate techniques can be used to improve the
determination of the position. For instance, one could use the
average of the points at half the peak intensity along the slopes of
the image, or fit these points to a certain profile. This is beyond
the scope of the present paper and is left for future work.
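
As an illustration of the first of these ideas, the sketch below estimates a sub-pixel position as the midpoint of the two half-maximum crossings on either side of a peak. It is a hypothetical implementation written for this note, not a method tested in this paper; it assumes a positive, isolated peak and uses linear interpolation between pixels.

```python
import numpy as np

def half_max_position(profile, i_peak):
    """Midpoint of the two half-maximum crossings around pixel i_peak."""
    half = 0.5 * profile[i_peak]
    left = i_peak
    while left > 0 and profile[left - 1] >= half:        # walk down the left slope
        left -= 1
    right = i_peak
    while right < len(profile) - 1 and profile[right + 1] >= half:
        right += 1                                        # walk down the right slope
    if left == 0 or right == len(profile) - 1:
        return float(i_peak)                              # no crossing: keep the peak pixel
    # Linear interpolation of the crossing between neighbouring pixels.
    x_left = (left - 1) + (half - profile[left - 1]) / (profile[left] - profile[left - 1])
    x_right = right + (profile[right] - half) / (profile[right] - profile[right + 1])
    return 0.5 * (x_left + x_right)

if __name__ == "__main__":
    x = np.arange(200)
    sigma = 5.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))      # FWHM = 5 pixels
    profile = np.exp(-(x - 100.3) ** 2 / (2.0 * sigma**2))
    print(half_max_position(profile, int(profile.argmax())))
```

On a noisy filtered image the crossings themselves fluctuate, so the gain over simply taking the peak pixel would have to be checked with simulations.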

Figure 13. Simple detection results for the case $\gamma = 2.2$. The different lines and panels are the same as those in figure 11.

6.2 Simulations in the simple measurement scenario

Now consider that the true amplitude of the source is unknown.
This is the most frequent case in practice. In such a case, one must
both detect and estimate the amplitude of the sources that are
supposed to be embedded in the data. This problem can be addressed by
adopting the methodology considered at the end of section 3.2: in the
absence of any a priori information about the p.d.f. $p(A)$ we assume
uniformity in the interval $[0, \nu_c]$, where $\nu_c$ is some cut
threshold above which sources are not expected to be found. An
approximate value of $\nu_c$ can be obtained directly from the data
under consideration simply by adopting
$\nu_c = (|I_{\max}| + |I_{\min}|)/\sigma$, where $I_{\max}$ and
$I_{\min}$ are the maximum and minimum data values, respectively. In
practice, it is very unlikely that a source with normalized amplitude
greater than this value is present in the data. If the p.d.f. $p(A)$
is known, that information can be used instead of the uniform
distribution. As explained in section 3.2, using the p.d.f. $p(A)$ it
is possible to determine a region of acceptance for the detections
and the normalized likelihood $L(\nu, \kappa|\nu_s)$. By means of the
normalized likelihood one can estimate the most probable value and
confidence limits ('error bars') for the amplitude $A$ of any
detected peak.
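
The data-driven cut threshold is a one-line computation. The sketch below is illustrative; the function name is hypothetical, and it assumes that `sigma` is the dispersion of the same (filtered) data from which the extreme values are taken.

```python
import numpy as np

def cut_threshold(data, sigma):
    """Approximate cut: (|I_max| + |I_min|) / sigma."""
    return (abs(np.max(data)) + abs(np.min(data))) / sigma

if __name__ == "__main__":
    toy = np.array([-1.0, 0.5, 3.0])   # toy values, not real data
    print(cut_threshold(toy, 2.0))
```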

To test this methodology, a set of simulations similar to the ones
described in section 6.1 was performed. Half of the simulations had a
Gaussian source of FWHM = 5 pixels located at the central pixel,
whereas the other half had only noisy background. Now we try to
detect sources in both kinds of simulations without knowing a priori
the amplitude of the (hypothetical) source. In each case, a region of
acceptance is determined as in eqns. (13, 14). If a peak is found and
it lies inside the region of acceptance, its most probable amplitude
is estimated by looking for the maximum value of the normalized
likelihood $L(\nu, \kappa|\nu_s)$, and the 68% confidence limits
('error bars') are calculated using the same normalized likelihood.
The number of 'detections' found in the simulations without source
indicates the probability of spurious detection in this scenario.
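
A generic numerical version of this last step is sketched below. It does not implement eqn. (24): the likelihood is passed in as a callable, and the Gaussian used in the usage lines is a stand-in chosen only so that the example runs. The grid search and the highest-likelihood construction of the interval are assumptions of this sketch and are not necessarily the construction used for the figures.

```python
import numpy as np

def estimate_amplitude(likelihood, nu_c, n_grid=2001, level=0.68):
    """Most probable nu_s on [0, nu_c] and an interval containing `level`.

    `likelihood` is any callable returning L(nu, kappa | nu_s) for a trial
    nu_s at fixed data (nu, kappa). For a multimodal likelihood the
    returned limits are the hull of the selected grid points.
    """
    grid = np.linspace(0.0, nu_c, n_grid)
    like = np.array([likelihood(v) for v in grid])
    like = like / like.sum()                       # normalise on the grid
    best = grid[like.argmax()]
    order = np.argsort(like)[::-1]                 # grid points by decreasing likelihood
    n_keep = np.searchsorted(np.cumsum(like[order]), level) + 1
    kept = grid[order[:n_keep]]
    return best, kept.min(), kept.max()

if __name__ == "__main__":
    # Stand-in likelihood (assumption): unit-width Gaussian centred on 3.
    toy = lambda v: np.exp(-0.5 * (v - 3.0) ** 2)
    print(estimate_amplitude(toy, 6.0))
```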

The amplitude of the simulated sources takes values between 0 and
a maximum value $A_{\max}$ such that
$A_{\max}/\langle\sigma_{\mathrm{SAF}}\rangle \simeq 5$, where
$\langle\sigma_{\mathrm{SAF}}\rangle$ is the average value of
$\sigma_0$ over 1000 simulations filtered with the SAF. We take this
value as the value of $\nu_c$ for the calculations. For the sake of
simplicity and without any loss of generality, we consider 25
amplitude bins in the interval $[0, \nu_c]$ instead of a continuous
sampling. For every amplitude bin, 5000 simulations with source and
5000 simulations without source were performed. As in the previous
section, three cases were considered: background indexes
$\gamma = 0$, $\gamma = 1.4$ and $\gamma = 2.2$.



6.2.1 The case $\gamma = 0$

In the simple measurement scenario we need to take two aspects
into account. On the one hand, there is the pure detection aspect of
the process, that is, how many sources are detected, and how many
spurious detections due to background are accepted by our decision
criterion. On the other hand, there is the issue of how well the
parameters of the sources (i.e. their amplitude) are estimated.

The first three rows of table 1 show the relevant results for the
simulations with $\gamma = 0$ and the three filters. The theoretical
expectation for the number of simulations we performed is shown in
parentheses. The number of detections is slightly lower in the
simulations than in the theoretical prediction, as we have seen in
section 6.1, and for the same reasons that were explained there. The
number of spurious detections $N_b$ was slightly higher than the
theoretical expectation. The lower number of detections and higher
number of spurious sources lead to ratios $r = N_d/N_b$ that are
worse than in the theoretical case. The difference, however, is never
greater than 20%, and the theoretical behaviour is preserved: the MF
gives the best ratio $r$, followed by the SAF and then by the MHW.
The agreement in the power $1-\beta$ is very good for the three
filters.

Figure 14. Determination of the amplitude in the case of simple
measurement and $\gamma = 0$. The estimates using eqn. (32) (filled
circles) and using the measured intensity (open circles) are shown
for the three filters. The lower right panel shows the comparison
between the three filters (filled circles for SAF, open circles for
MF, asterisks for MHW).

Regarding the estimation of the amplitudes, the results are shown
in figure 14. The filled circles show the mean value of the estimated
normalized amplitude as a function of the true normalized amplitude
of the source. The estimation has been performed using eqn. (32). As
a reference, the mean values of the directly measured amplitudes are
shown by the open circles. Clearly, the estimation based on the
maximization of the likelihood $L(\nu, \kappa|\nu_s)$ is more
accurate than the naive estimation that directly uses the measured
values of the maxima. Error bars show the dispersion of the estimates
around the mean value in the simulations (solid line for the
estimation using eqn. (32) and dotted line for the naive estimation).

For high amplitudes the estimation of the amplitude is unbiased
for the three filters. At low amplitudes there is a positive bias due
to a selection effect: the sources are very faint, and only when a
background fluctuation enhances the source by chance is the
associated maximum able to enter the acceptance region $R_*$. Using
eqn. (32) we are able to estimate reasonably well amplitudes as low
as $2.5\sigma_{\mathrm{SAF}}$.
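
The selection effect can be reproduced with a toy Monte Carlo. In the sketch below a simple threshold on the measured normalized amplitude stands in for the acceptance region, and the noise is taken to be Gaussian with unit dispersion; both are simplifying assumptions made here, and the function name is hypothetical.

```python
import numpy as np

def mean_accepted_amplitude(true_nu, threshold, n_sim, rng):
    """Mean measured amplitude of the peaks that pass a simple threshold."""
    measured = true_nu + rng.standard_normal(n_sim)   # source plus unit-dispersion noise
    accepted = measured[measured > threshold]         # stand-in for the acceptance region
    return accepted.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    for true_nu in (1.0, 3.0, 5.0):
        print(true_nu, mean_accepted_amplitude(true_nu, 2.0, 100000, rng))
```

Faint sources are accepted only when the noise pushes them above the cut, so their mean accepted amplitude lies well above the true value, whereas bright sources are accepted almost always and show essentially no bias.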

The lower right panel of figure 14 shows the mean values of the
estimated amplitudes using eqn. (32) for comparison between the three
filters. SAF is denoted by filled circles, whereas open circles and
asterisks stand for MF and MHW, respectively. The three filters show
a very similar efficiency at high amplitudes, whereas at low
amplitudes the MF and the SAF work better than the MHW.

As noted in section 3.2, using the normalized likelihood
$L(\nu, \kappa|\nu_s)$ it is possible to give a confidence interval
for the estimation of the amplitude. To illustrate this point,
figure 15 shows the estimated amplitude and the 68% confidence
intervals calculated using eqn. (24) for the three filters in two
different cases: when the source is at the limit of detection (left
panels of the figure) and when the source has a good SNR after
filtering (right). Only 30 examples were randomly chosen for each
filter and amplitude bin in order to make the plot readable. For
comparison, the mean value and statistical error bar that appear in
figure 14 are shown at the right of each plot. The confidence
intervals are significantly larger than the statistical error bars.
Hence, they must be considered as upper limits to the true error.
However, in a realistic case they are the only thing that one can
safely say about the error distribution without having to resort to
simulations.

Figure 15. Some examples of estimation of the amplitude with
its error bars calculated using eqns. (32) and (24) for the case
$\gamma = 0$. In each panel 30 randomly chosen detections are shown
(open circles) with their 68% confidence intervals. The three panels
on the left show detections for the source amplitude bin number 10
(corresponding to $\nu_s \sim 2$), whereas the panels on the right
show detections for the source amplitude bin number 25
($\nu_s \sim 5$), for the three filters. The horizontal dotted line
shows the true amplitude of the sources. For comparison, the mean
value of the estimated amplitude for all the detections in the same
bin is shown on the right of each panel (filled circle) with its
$1\sigma$ statistical error bar.



6.2.2 The case $\gamma = 1.4$

Rows 4 to 6 of table 1 show the results for the simple
measurement scenario and $\gamma = 1.4$. As expected, the total
number of detections $N_d$ has increased with respect to the case
$\gamma = 0$, while the number of spurious detections $N_b$ has
decreased. The relative differences between both numbers and their
theoretical expectations are below 13% in all the cases. The lowest
number of spurious detections corresponds to the SAF. Moreover, the
best ratio $r = N_d/N_b$ corresponds to the SAF as well. The
difference between the predicted $r$ and the ratio given by the
simulations is below 3% for the SAF and the MF, and below 6% in the
case of the MHW. The agreement in the probability $1-\beta$ is very
good for the three cases.

Figure 16 shows the results of the estimation of the amplitude in
the case $\gamma = 1.4$. The meaning of the panels, points and lines
is the same as in figure 14. Again, the estimation of the amplitude
using eqn. (32) is better than the simple estimation that directly
uses the measured intensity of the maxima. The three filters perform
very similarly in this respect. As in the previous case, the
effective limit for a good estimation of the amplitude is around
$\nu_{s,\mathrm{SAF}} = 2.5$. We would like to remark, however, that
since the filters produce greater amplification in this case, the
absolute (non-normalized) amplitude one can detect and correctly
estimate is lower than in the case $\gamma = 0$.

The estimation of confidence intervals works exactly in the same
way as before.

Figure 16. Determination of the amplitude in the case of simple
measurement and $\gamma = 1.4$. The estimates using eqn. (32) (filled
circles) and using the measured intensity (open circles) are shown
for the three filters. The lower right panel shows the comparison
between the three filters (filled circles for SAF, open circles for
MF, asterisks for MHW).

Figure 17. Determination of the amplitude in the case of simple
measurement and $\gamma = 2.2$. The estimates using eqn. (32) (filled
circles) and using the measured intensity (open circles) are shown
for the three filters. The lower right panel shows the comparison
between the three filters (filled circles for SAF, open circles for
MF, asterisks for MHW).

Table 1. Results from the simulations in the simple measurement scenario, compared with the theoretical expectations (in parentheses).
Three background ('noise') regimes are considered: $\gamma = 0$, $\gamma = 1.4$ and $\gamma = 2.2$. For each value of $\gamma$, three filters have been studied:
the scale-adaptive filter (SAF), the matched filter (MF) and the Mexican Hat wavelet (MHW). The third column shows the number of
correct detections from the total of simulations performed (125000). The fourth column shows the number of spurious detections (due
to background fluctuations in absence of sources) per pixel from the total number of simulations performed. The fifth column shows the
power of the detection $1-\beta$ (i.e. $\beta$ is the probability of interpreting signal as noise). The sixth column shows the ratio between the number of
true and spurious detections, on the basis of a probability of 50% of presence of a source in the pixel considered.

$\gamma$   Filter   $N_d$           $N_b$         $1-\beta$     $r = N_d/N_b$

0          SAF      37863 (39914)   912 (818)     0.84 (0.83)   41.52 (48.79)
0          MF       41199 (45154)   498 (496)     0.85 (0.85)   82.73 (91.04)
0          MHW      33742 (34616)   1715 (1436)   0.79 (0.76)   19.67 (24.11)

1.4        SAF      46668 (52277)   630 (718)     0.80 (0.81)   74.08 (72.81)
1.4        MF       48575 (55279)   768 (848)     0.81 (0.82)   63.25 (65.19)
1.4        MHW      48266 (54455)   925 (983)     0.81 (0.81)   52.18 (55.40)

2.2        SAF      51396 (49863)   744 (824)     0.82 (0.81)   69.08 (60.51)
2.2        MF       60758 (70056)   721 (795)     0.83 (0.83)   84.26 (88.12)
2.2        MHW      60077 (70058)   697 (765)     0.82 (0.83)   86.19 (91.58)
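
The last column of Table 1 can be cross-checked against the detection counts. The short script below recomputes $r = N_d/N_b$ from the tabulated values, both for the simulations and for the theoretical expectations; the variable and function names are introduced here for illustration.

```python
# (N_d, N_b, r) as listed in Table 1: simulation first, theory second.
TABLE_1 = {
    ("0", "SAF"): ((37863, 912, 41.52), (39914, 818, 48.79)),
    ("0", "MF"): ((41199, 498, 82.73), (45154, 496, 91.04)),
    ("0", "MHW"): ((33742, 1715, 19.67), (34616, 1436, 24.11)),
    ("1.4", "SAF"): ((46668, 630, 74.08), (52277, 718, 72.81)),
    ("1.4", "MF"): ((48575, 768, 63.25), (55279, 848, 65.19)),
    ("1.4", "MHW"): ((48266, 925, 52.18), (54455, 983, 55.40)),
    ("2.2", "SAF"): ((51396, 744, 69.08), (49863, 824, 60.51)),
    ("2.2", "MF"): ((60758, 721, 84.26), (70056, 795, 88.12)),
    ("2.2", "MHW"): ((60077, 697, 86.19), (70058, 765, 91.58)),
}

def max_ratio_mismatch(table):
    """Largest |N_d / N_b - r| over all simulation and theory entries."""
    worst = 0.0
    for sim, theory in table.values():
        for n_d, n_b, r in (sim, theory):
            worst = max(worst, abs(n_d / n_b - r))
    return worst

if __name__ == "__main__":
    print(max_ratio_mismatch(TABLE_1))
```

All entries agree with the tabulated ratios to within the two-decimal precision of the table.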



6.2.3 The case $\gamma = 2.2$

Rows 7 to 9 of table 1 show the results for the simple
measurement scenario and $\gamma = 2.2$. The total number of
detections $N_d$ is greater than in previous cases, that is, the
amplification (gain) of the filters is greater than for lower
$\gamma$ values. In the same way, the number of spurious detections
$N_b$ is smaller than before. The relative differences between both
numbers and their theoretical expectations are below 15% in all the
cases. As expected, the lowest number of spurious detections and the
best ratio $r$ correspond to the MHW (but the MF performs almost
equally well). Again, there is a remarkable agreement in the
probability $1-\beta$ between simulations and theoretical
expectations.

Figure 17 shows the results of the estimation of the amplitude in
the case $\gamma = 2.2$. There are no qualitative differences with
respect to the previous two cases. Note once more that having the
same normalized amplitude $\nu$ as in the cases $\gamma = 0$, 1.4
does not mean having the same amplitude before filtering. As the gain
of the filter grows, we can reach fainter sources.



7 CONCLUSIONS 

Filtering is a useful technique for the denoising and enhancing of
compact signals in Astronomy and many other applications. However,
the diversity of available filters makes it difficult to choose which
one is the most convenient in any particular case. In this paper we
have addressed the problem of the choice of filter for the detection
and measurement of point sources in a noisy background. We focus on
the one-dimensional case and assume that the sources have a Gaussian
spatial profile and that the noise can be modelled by an isotropic
and homogeneous Gaussian random field. We then compare three
different filters that have been extensively used in the study of
Cosmic Microwave Background radiation data: the Mexican Hat wavelet,
the matched filter and the scale-adaptive filter. Although we have
focused on these three filters, the methodology we describe can be
easily applied to any linear filter.

Two main scenarios have been explored: the case in which one
tries to detect a source of known amplitude but uncertain existence
and location in the data (simple detection) and the case in which one
tries to detect sources and determine their a priori unknown
amplitude (simple estimation). In both cases, local peak detection is
the first step to be undertaken. In the one-dimensional case, any
local peak is characterised by its intensity (amplitude) $\nu$ and
its curvature $\kappa$. Once a peak is detected, a decision about its
validity must be made.

We use the Neyman-Pearson criterion to define a region of
acceptance in the $(\nu, \kappa)$ space that maximizes the
significance of the detections. The limits of this region depend on
the properties of the background (namely, its power spectrum) and on
the filter that has been applied to the data.

In the simple measurement scenario, the Neyman-Pearson criterion
is applied by calculating the a posteriori p.d.f. of all the possible
values of the amplitude given the data $(\nu, \kappa)$. In the
absence of any a priori information about the p.d.f. of the
amplitudes of the sources, $p(A)$, we assume uniformity between 0 and
a certain cut $\nu_c$ that can be easily guessed from the data. If we
had a priori information about $p(A)$, it could be straightforwardly
included in the formalism. Once the source is detected, its amplitude
$A$ can be estimated by maximizing the likelihood
$L(\nu, \kappa|\nu_s)$ with respect to $\nu_s$. Convenient confidence
intervals can be given for any detection using the previous
normalized distribution.

We have compared the MHW, the SAF and the MF in the two
scenarios presented above. Both analytical and empirical
comparisons have been performed considering a generic
background characterised by a power spectrum
$P(q) \propto q^{-\gamma}$. In the analytical comparison we
have shown how to derive useful formulae to predict the number
density of local peaks that lie inside the Neyman-Pearson
region of acceptance, both in the presence and in the absence
of a source. With these quantities it is possible to compute
the ratio between the number of true detected sources and the
number of spurious detections, that is, a measure of the
reliability
of each filter in the task of detecting sources. We find that,
regarding this last quantity, there are three different regions
demarcated by the value of the index $\gamma$: for
$0 \leq \gamma < 1$ the MF outperforms the other two filters in
reliability as well as in the total number of true detections.
For $\gamma = 1$ the MF and the SAF coincide. For
$1 < \gamma \lesssim 1.6$ the SAF gives the best reliability;
its relative difference with respect to the MF is greater than
10%. For $\gamma = 2$ the MF and the MHW coincide. Finally, for
$\gamma \gtrsim 1.6$ the MHW is the most reliable filter for
the detection, although the matched filter performs almost
equally well. Moreover, the performance of the MHW can be
improved by using it with an optimal scale (that can be
estimated from the data) which gives maximum amplification of
the source. These three regions are present, and their limits
are roughly the same, both in the simple detection and in the
simple measurement scenarios.

We have performed exhaustive simulations in order to test the
previous ideas. The cases $\gamma = 0$, 1.4 and 2.2 have been
chosen with the aim of exploring the three regions described
above. The results of the simulations agree with the
analytically expected behaviour.
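
A simulation of this kind can be sketched as follows. The sketch is not the code used for this work; it only illustrates one common way of generating a Gaussian background with a power-law spectrum (colouring white noise in Fourier space) and of applying a matched filter, whose Fourier transform is proportional to the source profile divided by the power spectrum. The function names, the map size, the source width of 3 pixels, the random seed, the removal of the zero-frequency mode and the definition of the gain as a ratio of dispersions are all assumptions made for the example.

```python
import numpy as np

def power_law_background(n, gamma, rng):
    """Zero-mean, unit-variance 1D Gaussian field with P(q) ~ q**(-gamma)."""
    q = np.fft.rfftfreq(n)                 # q[0] = 0, then positive frequencies
    amp = np.zeros_like(q)
    amp[1:] = q[1:] ** (-gamma / 2.0)      # sqrt(P(q)); the q = 0 mode is dropped
    white = np.fft.rfft(rng.standard_normal(n))
    field = np.fft.irfft(amp * white, n=n)
    return field / field.std()

def matched_filter(data, profile, gamma):
    """Apply psi(q) ~ tau(q) / P(q), scaled so that a source with the given
    profile (centred at index n // 2) keeps its amplitude after filtering."""
    n = data.size
    q = np.fft.rfftfreq(n)
    inv_p = np.zeros_like(q)
    inv_p[1:] = q[1:] ** gamma             # 1 / P(q), again without q = 0
    tau = np.fft.rfft(np.fft.ifftshift(profile))   # profile moved to index 0
    psi = np.conj(tau) * inv_p
    psi /= np.fft.irfft(tau * psi, n=n)[0]         # unit response at the source
    return np.fft.irfft(np.fft.rfft(data) * psi, n=n)

# Illustrative values only (not the parameters of the simulations in the text).
n, gamma, width, amplitude = 4096, 2.2, 3.0, 1.0
x = np.arange(n) - n // 2
profile = np.exp(-x**2 / (2.0 * width**2))
rng = np.random.default_rng(0)
background = power_law_background(n, gamma, rng)
filtered = matched_filter(background + amplitude * profile, profile, gamma)
filtered_bg = matched_filter(background, profile, gamma)
gain = background.std() / filtered_bg.std()        # assumed definition of gain
```

Because the filter is scaled to leave the source amplitude unchanged, any improvement in detectability shows up as a reduction of the background dispersion, which is what the last line measures.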

Regarding the estimation of sources of unknown amplitude, the
three filters perform equally well. The estimation of the
amplitude using the likelihood of the data is appreciably
better than the estimation that directly uses the measured
intensity of the peaks. At low source amplitudes there is a
positive bias due to a selection effect. With these filters
and the likelihood estimator, it is possible to safely reach
thresholds as low as $2.5\sigma_{\mathrm{SAF}}$ in the filtered
data. Due to the amplification effect of the filters, the
equivalent thresholds in the unfiltered maps can be very low,
especially for high values of the index $\gamma$. As an
example, in the case $\gamma = 2.2$ the MF produced in our
simulations a mean gain of 7.5, which means that the 'safe
threshold' of 2.5 after filtering translates into a threshold
of 0.33 before filtering.
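
The arithmetic behind this example can be written out explicitly. The symbol $g$ for the mean gain and the subscripts below are labels introduced here only for illustration, and the thresholds are assumed to be expressed in units of the dispersion of the corresponding map:

```latex
\[
  \nu_{\mathrm{unfiltered}}
  = \frac{\nu_{\mathrm{filtered}}}{g}
  = \frac{2.5}{7.5} \simeq 0.33
\]
```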

The ideas presented in this paper can be extended to more
general filtering schemes and to two-dimensional data sets.
In the latter case, other quantities such as the ellipticity
of the peaks can be useful to establish decision criteria
similar to the ones presented here. Future work will address
this particular issue.